The population (dataset of interest) is 24 mice, 12 of each sex and 12 of each POE, with the same genetic composition as a result of crossing. Genes where at least three samples do not have 10 count were removed. Genes without at least one count for both alleles were removed. Genes with a significant sex, parent or interaction effect was removed.
Our sample size is 4 vs. 4. We select 8 subjects from our dataset, 4 for each sex and 4 for each POE. For the (i,j)th gene-sample in our real (sub)dataset, we simulate \(\text{LOGIT}_{ij} = \log\left(\frac{p}{1-p}\right)_{ij} = \alpha_i+\beta_i x\), where \(\alpha_i\) is the mean proportion of reads mapped to allele 1 for the ith gene, \(\beta_i \sim N(0,1)\) and \(x = (0,1,0,...,1)\). Then \(p_{ij} = \frac{1}{1+\exp(-\text{LOGIT}_{ij})}\). We then simulate \(Y_{ij}\) from \(\text{Betabin}(p_{ij}, n_{ij}, \phi_i)\) where \(n_{ij}\) is the total reads mapping to both alleles for gene-sample (i,j) and \(\phi_i\) is the overdispersion MLE of the ith gene. These will be our simulated allele counts.
MAP successfully shrinks erroneously large estimates and improves overall estimation on average. MAP mainly effects estimates with smaller counts. Ash has more extreme shrinkage, and has both more accurate predictions on average and higher concordance with the true top gene list.
We define shrinkage as movement from MLE to zero. We define an estimate as shrunk if the estimate is at least 0.1 closer to zero than the MLE. We can see that ash is shinking more frequently and more severely, on average.
## number of shrunk (shrinkage>0.1) apeglm estimates
## [1] 1038
## number of shrunk (shrinkage>0.1) ash estimates
## [1] 1094
## quantiles of shrinkage for apeglm (top) and ash (bottom)
## 50% 75% 90% 95% 97.5% 99%
## 0.01096110 0.02926107 0.10540487 0.20808906 0.31866782 0.47575610
## 99.5%
## 0.66355286
## 50% 75% 90% 95% 97.5% 99%
## 0.009314443 0.032127520 0.113304536 0.230402186 0.378838910 0.677579480
## 99.5%
## 1.029786164
We define improvement as movement from MLE to truth. The plots show the extent and distribution of shrinkage and improvement for apeglm and ash. We can see that ash gives better estimates than apeglm on average. From the improvement histograms, we can see that ash gives improvement scores that are more extreme/hit-or-miss (which makes sense conisderring ash is giving bigger shrinkage)
## mean apeglm improvement for shrunk genes
## [1] 0.09190745
## mean ash improvement for shrunk genes
## [1] 0.1056943
## quantiles of improvement with shrunk genes for apeglm (top) and ash (bottom)
## 0.5% 1% 2.5% 5% 10% 25%
## -0.6699351 -0.5677980 -0.4046308 -0.3329838 -0.2404954 -0.1363746
## 50% 75% 90% 95% 97.5% 99%
## 0.1141548 0.2309577 0.3708697 0.4986104 0.7179197 1.5545750
## 99.5%
## 2.7041890
## 0.5% 1% 2.5% 5% 10% 25%
## -0.8275969 -0.5967537 -0.4862418 -0.3515606 -0.2600099 -0.1328322
## 50% 75% 90% 95% 97.5% 99%
## 0.1110962 0.2312324 0.4434070 0.6688729 1.0230625 1.6982119
## 99.5%
## 2.4236468
## Mean absolute error
## $MAEmle
## [1] 0.1556092
##
## $MAEmap
## [1] 0.1446702
##
## $MAEash
## [1] 0.143204
## MAE and MAD for genes shrunk by apeglm (recall we define shrunk as >0.1 shrinkage)
## $MAEmle
## [1] 0.5114665
##
## $MAEmap
## [1] 0.4195591
##
## $MAEash
## [1] 0.405118
## $MADmle
## [1] 0.3783456
##
## $MADmap
## [1] 0.3219125
##
## $MADash
## [1] 0.3135298
## MAE and MAD for genes shrunk by ash
## $MAEmle
## [1] 0.4763569
##
## $MAEmap
## [1] 0.3850236
##
## $MAEash
## [1] 0.3706626
## $MADmle
## [1] 0.331633
##
## $MADmap
## [1] 0.2807853
##
## $MADash
## [1] 0.2763752
All three estimatation procedures give confidence intervals with roughly the same width and roughly control for the specified confidence level.
## proportion of times true effects were in the 95% confidence intervals for MLE, MAP and ash (in that order)
## [1] 0.9463594
## [1] 0.9443198
## [1] 0.9438099
## summary statistics for standard errors of MLE, MAP and ash estimates (in that order)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.06452 0.09190 0.12179 0.19481 0.20534 15.00000
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.06432 0.09154 0.12092 0.17781 0.20137 1.99451
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.06252 0.09121 0.12063 0.17483 0.20149 0.99438
We define a gene’s total counts as the sum of the gene’s counts across samples.
We wish to determine the filtering rule such that the MLE is most accurate with regard to CAT plot. We look at three rules: removing genes where less than half the samples have a certain count (which we call the threshold), removing genes where less than all the samples have a certain count (threshold), and removing genes where the totl counts are less than a certain amount (threshold). For each rule, we look at various different thresholds, and calculate the “average MLE concordance” that results when filtering with that threshold. The average MLE concordance is just taking the concordance when looking at top 10 genes, top 20 genes, etc., and averaging. We chose the rule that has the best concordance, and if multiple rules have similar concordance, we chose the rule that has the smallest number of genes removed. Across all rules and thresholds, filtering out genes with total counts less than 610 is the best. Using this rule, MLE with filtering is neck-and-neck with ash in the CAT plot, and better than apeglm.
## Mean absolute error
## $MAEmle
## [1] 0.1556092
##
## $MAEmap
## [1] 0.1446702
##
## $MAEash
## [1] 0.143204
## MAE for apeglm-shrunk genes (shrinkage>0.1)
## $MAEmle
## [1] 0.5114665
##
## $MAEmap
## [1] 0.4195591
##
## $MAEash
## [1] 0.405118
## MAE for ash-shrunk genes (shrinkage>0.1)
## $MAEmle
## [1] 0.4763569
##
## $MAEmap
## [1] 0.3850236
##
## $MAEash
## [1] 0.3706626
## MAE for MLE with filtering
## [1] 0.1012654
We try simulating \(\beta_i\) from \(t_3/10\) so that we have mostly close-to-zero effects (but with some relatively large effects occasionally appearing). Furthermore \(\phi_i\) from \(\text{Exp}(1/87)\) with probability \(0.5\) and 500 with probability \(0.5\). The distribution of overdispersion from the real data was approximated by a similar mixture distribution: one component was \(\text{Exp}(1/196)\) with 30% proportion, and the other component was a point mass at 500 with 70% proportion. Therefore, our simulated overdispersion will lead to nosier data.
## quantiles of simulated beta
## 0.5% 1% 2.5% 5% 10%
## -0.5882801928 -0.4555905862 -0.3203294299 -0.2405379957 -0.1654734256
## 25% 50% 75% 90% 95%
## -0.0773528447 -0.0003879518 0.0756327243 0.1622160960 0.2349559299
## 97.5% 99% 99.5%
## 0.3267021659 0.4757646527 0.6302828838
## largest 20 values of beta in absolute value
## [1] 4.498880 2.699080 2.044930 1.867515 1.449673 1.377370
In this case, no amount of filtering leads to CAT as good as apeglm and ash. Apeglm had an average concordance of close to 0.5.
False large effects were shrunk, but one truly large effects was also incorrectly shrunk torwards zero by apeglm. Shrinkage occurs more often, more severely and across different count sizes. Again, ash has better concordance with top genes and smaller mean absolute error.
Quantiles of shrinkage describes the large shrinkage for this dataset. Ash and apeglm both correctly deduce that the majority of true effects are null.
## Quantiles of apeglm shrinkage
## 50% 75% 90% 95% 97.5% 99%
## 0.05978246 0.15259613 0.33710362 0.51637194 0.70018592 1.01845232
## 99.5%
## 1.27848270
## Quantiles of ash shrinkage
## 50% 75% 90% 95% 97.5% 99%
## 0.07305032 0.17421984 0.35462841 0.53064723 0.71947012 1.02534066
## 99.5%
## 1.29288650
## MAE
## $MAEmle
## [1] 0.1915662
##
## $MAEmap
## [1] 0.09725696
##
## $MAEash
## [1] 0.09053847
## MAE for genes shrunk by apeglm (i.e. genes where apeglm shrinkage>0.1)
## $MAEmle
## [1] 0.36186
##
## $MAEmap
## [1] 0.1284094
##
## $MAEash
## [1] 0.115935
## MAE for genes shrunk by ash
## $MAEmle
## [1] 0.3447996
##
## $MAEmap
## [1] 0.1322552
##
## $MAEash
## [1] 0.1179926
We tried simulating standard normal effects in 5 vs. 5 sampling. Output is hidden as results are the same as the 4 vs. 4 normal effects case: ash gives more shribkage, better estimates on average, and higher concordance. Filtering did not lead to universally better concordance than ash (though it beat apeglm).
We simulated standard normal effects in a 3 vs. 3 design as well. Most output is hidden as results are similar to the 4 vs. 4 normal effects case and 5 vs. 5 normal effects case. The only differences are that filtering fails to beat ash (but it still beats apelgm) and more truly large effects are incorrectly and overly shrunk to zero by ash but not overly shrunk by apeglm.